Fix lazy If branch pruning and skipped-parent handling in graph runtime by JPPhoto · Pull Request #9079 · invoke-ai/InvokeAI

JPPhoto · 2026-04-20T23:56:40Z

Summary

Fixes lazy If execution in invokeai/app/services/shared/graph.py for workflows where a selected branch shares ancestors with the unselected branch.

The runtime bug was that resolving an If node decremented indegree for the pruned input but left the pruned execution edge in place, so later parent completion could decrement the same dependency again. This could make the If node become ready too early, underflow indegree, or propagate None into required downstream inputs.

This PR:

prunes unselected If input edges from the execution graph during branch resolution
ignores skipped prepared exec nodes when matching live parents for downstream materialization
tolerates missing result payloads from skipped branch-local exec nodes during If input hydration
adds focused backend regression tests for positive and negative branch-selection cases
updates invokeai/app/services/shared/README.md to document the runtime behavior

Related Issues / Discussions

Complex workflow failure involving a single If node feeding positive_text_conditioning on flux_denoise.

QA Instructions

Load a workflow where collect -> if -> required downstream input is used and verify:

the selected branch output reaches the downstream node
branch-exclusive unselected ancestors are skipped
no KeyError, indegree underflow, or missing required-connection error occurs from the If runtime path

Merge Plan

Checklist

The PR has a short but descriptive title, suitable for a changelog
Tests added / updated (if applicable)
❗Changes to a redux slice have a corresponding migration
Documentation added / updated (if applicable)
Updated What's New copy (if doing a release after this PR)

Pfannkuchensack · 2026-04-25T09:48:53Z

Findings

Low: invokeai/app/services/shared/graph.py:781-795 _prepare_if_inputs silently swallows the case where a SELECTED branch's source has no entry in results and uses the field's default (typically None). This is a defensive fallback documented in invokeai/app/services/shared/README.md:182-184, but the synthetic test in tests/test_graph_execution_state.py:856-883 (test_prepare_if_inputs_ignores_selected_branch_sources_without_results) only constructs the state by hand (g.executed.add(true_value_exec_id) without writing to g.results); no test exercises a real scheduler trajectory that produces a selected-branch source skipped-but-executed state. The risk is that a future scheduler bug that incorrectly marks a selected-branch source as skipped now propagates None to downstream consumers as silent data loss instead of surfacing as KeyError. To expose this issue, add a test that constructs an actual graph + scheduling sequence (for example interleaved/nested IfInvocation nodes whose branch-exclusivity classifications cause a selected-branch source to be skipped through mark_exec_node_skipped) and asserts whether IfInvocation.value should reasonably be None there. If no such trajectory can be built, document explicitly that this branch is unreachable and replace the silent fallback with a precise RuntimeError so a future regression is visible.

Low: invokeai/app/services/shared/graph.py:366-371 _get_collect_iteration_mappings reads source_prepared_mapping[source_node_id] directly and does NOT apply the new "skipped" filter that _get_prepared_nodes_for_source now applies at line 426-431. Any CollectInvocation whose parent has skipped prepared exec nodes would still wire those skipped exec nodes as inputs, and _build_collect_collection -> _get_copied_result_value (line 747-748) crashes on missing results. In the current branch-exclusivity classification (_node_outputs_stay_in_branch, lines 143-151), a Collect with a skipped parent should itself be classified branch-exclusive, so this is theoretically not reachable; but the asymmetry between Collect parent resolution and other node parent resolution is the kind of inconsistency that breaks the next time the exclusivity logic is refined. To expose this issue, add a test that constructs a graph where a CollectInvocation's item parent is on the unselected branch of an IfInvocation (forcing the parent to be skipped) while the Collect itself is reachable, and asserts the engine does not crash; or harden _get_collect_iteration_mappings to apply the same filter.

Low: invokeai/app/services/shared/graph.py:480-499 get_iteration_node short-circuits with if len(prepared_nodes) == 1: return next(iter(prepared_nodes)) after the new skipped-exec filter. With multiple iterations of an outer iterator, it is possible for exactly one iteration's prepared exec to be live (others skipped) and the function returns that single live exec node WITHOUT verifying it actually matches the caller's prepared_iterator_nodes / parent iteration combination. The two new tests test_get_iteration_node_returns_single_active_prepared_exec_node and test_get_iteration_node_ignores_skipped_prepared_exec_nodes both pass [] for prepared_iterator_nodes, so neither covers the case where iteration-path matching matters. To expose this issue, add a test with an iterator whose body contains an If, run the iterator with iteration_count > 1, resolve different iterations to different branches so only one iteration's branch source remains live, and assert that downstream consumers for the OTHER iteration are not incorrectly bound to the surviving live exec node.

Low: invokeai/app/services/shared/graph.py:255-266 mark_exec_node_skipped calls self._state.executed.add(exec_node_id) and self._state._set_prepared_exec_state(exec_node_id, "skipped") unconditionally. If a node was previously executed normally (entry in results, state == "executed") and a separately-resolved IF later classifies the same prepared exec as part of an unselected branch, the metadata flips to state == "skipped" while results[exec_id] still holds the real output. After the new filter in _get_prepared_nodes_for_source (line 426-431), this exec node would suddenly become invisible to downstream materialization even though its output was real. This requires multiple IfInvocation nodes with overlapping branch-exclusive classifications and is unlikely under the conservative _prune_nonexclusive_branch_nodes algorithm, but no test pins the contract. To expose this issue, add a test that builds two IfInvocation nodes whose true-branch-exclusive sets overlap on a single source node and checks behavior when one IF resolves first (producing results) and the second IF resolves the other way and tries to skip the same prepared exec.

Open Questions

The new fallback in _prepare_if_inputs (graph.py:781-795) is justified in invokeai/app/services/shared/README.md:182-184 as covering "selected branch's upstream exec node was skipped and therefore produced no runtime output". I could not construct a real scheduling trajectory, given the current _node_outputs_stay_in_branch classifier, where this happens for a SELECTED branch. If the authors have a concrete reproducer, that scenario should be added as a regression test; if not, the silent-None fallback should be replaced with an assertion so the unreachable branch becomes self-policing.

invokeai/app/services/shared/graph.py:195-201 _prune_unselected_if_inputs now mutates execution_graph by deleting edges. Other observers of the execution graph (event emitters, persistence/serialization, debugging UI, replay logic) may rely on the original edge set being preserved across an execution. I did not check whether GraphExecutionState (which is persisted as JSON, see model_config at line 1843-1857) is round-tripped during a run; if it is, a snapshot taken after pruning has structurally different edges from one taken before. This is not necessarily a defect but is a contract change worth confirming.

I had the problem that the UI didn't know that the graph had finished rendering. But I saw that you already addressed this in another PR?

JPPhoto · 2026-04-25T11:50:01Z

Low: invokeai/app/services/shared/graph.py:781-795 _prepare_if_inputs silently swallows the case where a SELECTED branch's source has no entry in results and uses the field's default (typically None). This is a defensive fallback documented in invokeai/app/services/shared/README.md:182-184, but the synthetic test in tests/test_graph_execution_state.py:856-883 (test_prepare_if_inputs_ignores_selected_branch_sources_without_results) only constructs the state by hand (g.executed.add(true_value_exec_id) without writing to g.results); no test exercises a real scheduler trajectory that produces a selected-branch source skipped-but-executed state. The risk is that a future scheduler bug that incorrectly marks a selected-branch source as skipped now propagates None to downstream consumers as silent data loss instead of surfacing as KeyError. To expose this issue, add a test that constructs an actual graph + scheduling sequence (for example interleaved/nested IfInvocation nodes whose branch-exclusivity classifications cause a selected-branch source to be skipped through mark_exec_node_skipped) and asserts whether IfInvocation.value should reasonably be None there. If no such trajectory can be built, document explicitly that this branch is unreachable and replace the silent fallback with a precise RuntimeError so a future regression is visible.

Low: invokeai/app/services/shared/graph.py:366-371 _get_collect_iteration_mappings reads source_prepared_mapping[source_node_id] directly and does NOT apply the new "skipped" filter that _get_prepared_nodes_for_source now applies at line 426-431. Any CollectInvocation whose parent has skipped prepared exec nodes would still wire those skipped exec nodes as inputs, and _build_collect_collection -> _get_copied_result_value (line 747-748) crashes on missing results. In the current branch-exclusivity classification (_node_outputs_stay_in_branch, lines 143-151), a Collect with a skipped parent should itself be classified branch-exclusive, so this is theoretically not reachable; but the asymmetry between Collect parent resolution and other node parent resolution is the kind of inconsistency that breaks the next time the exclusivity logic is refined. To expose this issue, add a test that constructs a graph where a CollectInvocation's item parent is on the unselected branch of an IfInvocation (forcing the parent to be skipped) while the Collect itself is reachable, and asserts the engine does not crash; or harden _get_collect_iteration_mappings to apply the same filter.

Low: invokeai/app/services/shared/graph.py:480-499 get_iteration_node short-circuits with if len(prepared_nodes) == 1: return next(iter(prepared_nodes)) after the new skipped-exec filter. With multiple iterations of an outer iterator, it is possible for exactly one iteration's prepared exec to be live (others skipped) and the function returns that single live exec node WITHOUT verifying it actually matches the caller's prepared_iterator_nodes / parent iteration combination. The two new tests test_get_iteration_node_returns_single_active_prepared_exec_node and test_get_iteration_node_ignores_skipped_prepared_exec_nodes both pass [] for prepared_iterator_nodes, so neither covers the case where iteration-path matching matters. To expose this issue, add a test with an iterator whose body contains an If, run the iterator with iteration_count > 1, resolve different iterations to different branches so only one iteration's branch source remains live, and assert that downstream consumers for the OTHER iteration are not incorrectly bound to the surviving live exec node.

Low: invokeai/app/services/shared/graph.py:255-266 mark_exec_node_skipped calls self._state.executed.add(exec_node_id) and self._state._set_prepared_exec_state(exec_node_id, "skipped") unconditionally. If a node was previously executed normally (entry in results, state == "executed") and a separately-resolved IF later classifies the same prepared exec as part of an unselected branch, the metadata flips to state == "skipped" while results[exec_id] still holds the real output. After the new filter in _get_prepared_nodes_for_source (line 426-431), this exec node would suddenly become invisible to downstream materialization even though its output was real. This requires multiple IfInvocation nodes with overlapping branch-exclusive classifications and is unlikely under the conservative _prune_nonexclusive_branch_nodes algorithm, but no test pins the contract. To expose this issue, add a test that builds two IfInvocation nodes whose true-branch-exclusive sets overlap on a single source node and checks behavior when one IF resolves first (producing results) and the second IF resolves the other way and tries to skip the same prepared exec.

I addressed these review concerns with tests first. The selected-branch missing-result path now fails loudly instead of silently propagating None in invokeai/app/services/shared/graph.py, and the README now states that this path should be unreachable in normal scheduling in invokeai/app/services/shared/README.md. Collect parent mapping now filters skipped prepared exec nodes in graph.py, single-live-node iterator matching now still verifies iterator ancestry in graph.py, and mark_exec_node_skipped() no longer hides already executed results by flipping executed nodes to skipped in graph.py.

Tests added/updated in tests/test_graph_execution_state.py cover:

selected If branch with missing result now raising
collect mapping ignoring skipped prepared parents
iterator-path mismatch not reusing the only surviving live exec from another iteration
executed nodes remaining visible even if later considered for skipping

Open Questions

The new fallback in _prepare_if_inputs (graph.py:781-795) is justified in invokeai/app/services/shared/README.md:182-184 as covering "selected branch's upstream exec node was skipped and therefore produced no runtime output". I could not construct a real scheduling trajectory, given the current _node_outputs_stay_in_branch classifier, where this happens for a SELECTED branch. If the authors have a concrete reproducer, that scenario should be added as a regression test; if not, the silent-None fallback should be replaced with an assertion so the unreachable branch becomes self-policing.

The concern about silent None fallback in invokeai/app/services/shared/graph.py is fair. There's not a demonstrated normal scheduler path where a SELECTED branch input should be missing because its upstream exec node was skipped. The current test in invokeai/tests/test_graph_execution_state.py is a defensive synthetic-state test, not a real scheduler reproducer. I addressed the first point by narrowing the fallback documentation and code comment so they no longer imply this is a normal selected-branch scheduler path.

invokeai/app/services/shared/graph.py:195-201 _prune_unselected_if_inputs now mutates execution_graph by deleting edges. Other observers of the execution graph (event emitters, persistence/serialization, debugging UI, replay logic) may rely on the original edge set being preserved across an execution. I did not check whether GraphExecutionState (which is persisted as JSON, see model_config at line 1843-1857) is round-tripped during a run; if it is, a snapshot taken after pruning has structurally different edges from one taken before. This is not necessarily a defect but is a contract change worth confirming.

My plan:

Keep the current pruning behavior for now.
Do not treat this as a retry/resume correctness bug.
Do treat it as a session-snapshot contract change and document it.

Why I think that is the right course:

Saved sessions are explicitly not used to drive live execution:
- invokeai/app/services/session_queue/session_queue_sqlite.py:721-723
Retry does not reuse the persisted execution_graph; it rebuilds a fresh GraphExecutionState from the original source graph only:
- invokeai/app/services/session_queue/session_queue_sqlite.py:977
The frontend code I found uses session.results and session.source_prepared_mapping, not session.execution_graph, for output handling.

So the mutation is real, but the impact appears limited to inspection/debug snapshots of failed/completed sessions, not runtime behavior or retry semantics. If you know of any real consumers of the execution graph, then I can separate
"pruned dependency bookkeeping" from the persisted execution_graph, for example via a private runtime-only
suppression set or a non-serialized field.

I had the problem that the UI didn't know that the graph had finished rendering. But I saw that you already addressed this in another PR?

Correct!

Pfannkuchensack · 2026-04-25T12:11:35Z

Optional Hardening Notes (non-blocking)

These are nice-to-haves on top of an already solid PR — none are defects, just polish.

1. Document idempotency of `mark_exec_node_skipped`

invokeai/app/services/shared/graph.py:255-269

The new early-return only short-circuits on state == "executed". Calling the function a second time on a node that is already "skipped" falls through and re-runs _remove_from_ready_queues, _set_prepared_exec_state, and the source-completion check. The result is idempotent in practice, but the asymmetry between the two terminal states ("executed" short-circuits, "skipped" re-runs) is subtle.

Optional: add a one-line comment noting that re-entry on "skipped" is intentionally idempotent, or short-circuit on either terminal state for symmetry.

2. Minor cost in `get_iteration_node` trivial path

invokeai/app/services/shared/graph.py:480-504

The single-prepared-node shortcut now always computes _get_parent_iterator_exec_nodes before returning, even when prepared_iterator_nodes is empty (in which case _matches_parent_iterators is vacuously true). The previous version returned in O(1) for that case.

Optional: skip the iterator-graph work when prepared_iterator_nodes is empty. Negligible impact unless this is on a hot path.

3. Improve diagnostics on `_prepare_if_inputs` defensive raise

invokeai/app/services/shared/graph.py:793-797

The new RuntimeError("IfInvocation selected input edge points at an exec node with no stored result output") correctly fails loud, but the message lacks the offending if_node.id, the edge source id, and the iteration path. If this branch ever does fire in production, those three values would shorten triage considerably.

Optional: include node.id, edge.source.node_id, and the iteration path in the message.

4. Tighten `test_mark_exec_node_skipped_does_not_hide_already_executed_results`

tests/test_graph_execution_state.py:954-967

The test seeds g.results[exec_id] via AnyTypeTestInvocation(...).invoke(Mock(InvocationContext)) and then asserts g.results[exec_id].value == "value". It works today because AnyTypeTestInvocation.invoke happens to return an output exposing .value, but the seam is incidental — a future change to that test invocation could turn this into a passing-but-misleading assertion.

Optional: construct the BaseInvocationOutput directly instead of routing through .invoke(Mock(...)).

JPPhoto · 2026-04-25T12:23:49Z

@Pfannkuchensack I finished the cleanup pass.

In invokeai/app/services/shared/graph.py, mark_exec_node_skipped() now short-circuits on both terminal states, so skipped re-entry is explicitly idempotent. In invokeai/app/services/shared/graph.py, get_iteration_node() restores the trivial fast path when there are no prepared iterator constraints. In invokeai/app/services/shared/graph.py, the defensive If raise now includes if_exec_id, source_exec_id, and iteration_path.

I also tightened the tests in tests/test_graph_execution_state.py: the missing-result If test now asserts the richer diagnostics, the executed-result test now uses AnyTypeTestInvocationOutput directly instead of going through .invoke(...), and there is a new idempotency test for repeated skip marking.

Pfannkuchensack

Works great.

JPPhoto requested review from blessedcoolant, dunkeroni and lstein as code owners April 20, 2026 23:56

github-actions Bot added python PRs that change python files services PRs that change app services python-tests PRs that change python tests labels Apr 20, 2026

JPPhoto added this to Invoke - Community Roadmap Apr 20, 2026

JPPhoto moved this to 6.13.x Theme: MODELS in Invoke - Community Roadmap Apr 20, 2026

JPPhoto added the v6.13.x label Apr 21, 2026

JPPhoto force-pushed the fix-if-branch-edge-pruning branch 3 times, most recently from 63e5b16 to f681759 Compare April 22, 2026 00:56

JPPhoto requested a review from Pfannkuchensack April 22, 2026 10:48

Fix lazy If branch pruning and skipped-parent handling in graph runtime

3785efb

JPPhoto force-pushed the fix-if-branch-edge-pruning branch from f681759 to 3785efb Compare April 23, 2026 15:57

Pfannkuchensack self-assigned this Apr 25, 2026

Tighten lazy If runtime edge-case handling

fdc0f39

Polish lazy If runtime diagnostics and idempotency

73b67fc

Pfannkuchensack approved these changes Apr 25, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix lazy If branch pruning and skipped-parent handling in graph runtime#9079

Fix lazy If branch pruning and skipped-parent handling in graph runtime#9079
JPPhoto wants to merge 3 commits intoinvoke-ai:mainfrom
JPPhoto:fix-if-branch-edge-pruning

JPPhoto commented Apr 20, 2026

Uh oh!

Pfannkuchensack commented Apr 25, 2026

Uh oh!

JPPhoto commented Apr 25, 2026 •

edited

Loading

Uh oh!

Pfannkuchensack commented Apr 25, 2026

Uh oh!

JPPhoto commented Apr 25, 2026

Uh oh!

Pfannkuchensack left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

JPPhoto commented Apr 20, 2026

Summary

Related Issues / Discussions

QA Instructions

Merge Plan

Checklist

Uh oh!

Pfannkuchensack commented Apr 25, 2026

Uh oh!

JPPhoto commented Apr 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Pfannkuchensack commented Apr 25, 2026

Optional Hardening Notes (non-blocking)

1. Document idempotency of mark_exec_node_skipped

2. Minor cost in get_iteration_node trivial path

3. Improve diagnostics on _prepare_if_inputs defensive raise

4. Tighten test_mark_exec_node_skipped_does_not_hide_already_executed_results

Uh oh!

JPPhoto commented Apr 25, 2026

Uh oh!

Pfannkuchensack left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

JPPhoto commented Apr 25, 2026 •

edited

Loading

1. Document idempotency of `mark_exec_node_skipped`

2. Minor cost in `get_iteration_node` trivial path

3. Improve diagnostics on `_prepare_if_inputs` defensive raise

4. Tighten `test_mark_exec_node_skipped_does_not_hide_already_executed_results`